Abstract

We consider exploration problems where a robot has to construct a complete
map of an unknown environment. We assume that the environment is modeled by
a directed, strongly connected graph. The robot's task is to visit all nodes
and edges of the graph using the minimum number R of edge traversals.
Koutsoupias [16] gave a lower bound for R of $\Omega(d^2 m)$, and Deng and
Papadimitriou [12] showed an upper bound of $d^{O(d)} m$, where m is the
number of edges in the graph and d is the minimum number of edges that have
to be added to make the graph Eulerian. We give the first sub-exponential
algorithm for this exploration problem, which achieves an upper bound of
$d^{O(\log d)} m$. We also show a matching lower bound of $d^{\Omega(\log d)} m$
for our algorithm. Additionally, we give lower bounds of $2^{\Omega(d)} m$,
resp. $d^{\Omega(\log d)} m$, for various other natural exploration algorithms.

1 Introduction 

Suppose that a robot has to construct a complete map of an unknown environment 
using a path that is as short as possible. In many situations it is convenient to 
model the environment in which the robot operates by a graph. This allows us to
neglect geometric features of the environment and to concentrate on combinatorial
aspects of the exploration problem. Deng and Papadimitriou [12] thus formulated
the following exploration problem. A robot has to explore all nodes and edges of
an unknown, strongly connected directed graph. The robot visits an edge when 
it traverses the edge. A node or edge is explored when it is visited for the first 
time. The goal is to determine a map, i.e. the adjacency matrix, of the graph using 
the minimum number R of edge traversals. At any point in time the robot knows 
(1) all visited nodes and edges and can recognize them when encountered again; 
and (2) the number of unvisited edges leaving any visited node. The robot does 
not know the head of unvisited edges leaving a visited node or the unvisited edges 
leading into a visited node. At each point in time, the robot visits a current node 
and has the choice of leaving the current node by traversing a specific known or an 
arbitrary (i.e. given by an adversary) unvisited outgoing edge. An edge can only be 
traversed from tail to head, not vice versa. 

If the graph is Eulerian, $2m$ edge traversals suffice [12], where m is the number
of edges. This immediately implies that undirected graphs can be explored with
at most $4m$ traversals (replace each undirected edge by two oppositely directed
edges, which yields an Eulerian graph with $2m$ edges). For a non-Eulerian graph,
let the deficiency d be the minimum number of edges that have to be added to
make the graph Eulerian. Deng and Papadimitriou [12] suggested studying the
dependence of R on m and d and showed the first upper and lower bounds: they
gave a graph such that any algorithm needs $\Omega(d^2 m / \log d)$ edge traversals,
and they also presented an algorithm that achieves an upper bound of $d^{O(d)} m$.
Koutsoupias [16] improved the lower bound to $\Omega(d^2 m)$. Deng and Papadimitriou
asked whether the exponential gap between the upper and lower bound can be
closed. Our paper is a first step in this direction: we give an algorithm that is
sub-exponential in d, namely it achieves an upper bound of $d^{O(\log d)} m$. We also
show a matching lower bound for our algorithm and exponential lower bounds for
various other exploration algorithms.

Note that d also arises in the complexity of the "offline" version of the
problem: Consider a directed cycle with one edge replaced by $d + 1$ parallel edges.
On this graph any Eulerian traversal requires $\Omega(dm)$ edge traversals. A simple
modification of the Eulerian online algorithm solves the offline problem on any
directed graph with $O(dm)$ edge traversals.

Related Work. Exploration and navigation problems for robots have been 
studied extensively in the past. The exploration problem in this paper was for- 
mulated by Deng and Papadimitriou based on a learning problem proposed by 
Rivest [19]. Betke et al. [8] and Awerbuch et al. [1] studied the problem of ex- 
ploring an undirected graph and requiring additionally that the robot returns to its 
starting point every so often. Bender and Slonim [9] showed how two cooperating 
robots can learn a directed graph with indistinguishable nodes, where each node 
has the same number of outgoing edges. Subsequent to the work in [12], Deng et 
al. [11] investigated a geometric exploration problem, whose goal is to explore a 
room with or without polygonal obstacles. Hoffmann et al. [15] gave an improved 
exploration strategy for rooms without obstacles. More generally, theoretical stud- 
ies of exploration and navigation problems in unknown environments were initiated 
by Papadimitriou and Yannakakis [18]. They considered the problem of finding a 
shortest path from a point s to a point t in an unknown environment and presented 
many geometric and graph based variants of this problem. Blum et al. [7] inves- 
tigated the problem of finding a shortest path in an unfamiliar terrain with convex 
obstacles. More work on this problem includes [2, 5, 6]. 

Our Results. Our main result is a new robot strategy, called Balance, that 
explores an arbitrary graph with deficiency d and traverses each edge at most (d + 
see Section 3. The algorithm does not need to know d in advance. 
The total number of traversals needed by the algorithm is also
$O(\min\{nm, dn^2 + m\})$, where n is the number of nodes. At the end of Section 3
we show that any exploration algorithm that fulfills two intuitive conditions
achieves an upper bound of $O(\min\{nm, dn^2 + m\})$. A depth-first search strategy
obtaining this bound was independently developed by Kwek [17].

In Section 4 we demonstrate that our analysis of the Balance algorithm is tight:
There exists a graph that is explored by our algorithm using $d^{\Omega(\log d)} m$ edge
traversals. We also show that various variants of the algorithm have the same lower
bound. In Section 2, we present lower bounds of $2^{\Omega(d)} m$, resp.
$d^{\Omega(\log d)} m$, for various other natural exploration algorithms to give some
intuition for the problem.

Our exploration algorithm tries to explore new edges that have not been visited 
so far. That is, starting at some visited node x with unvisited outgoing edges, the 
robot explores new edges until it gets stuck at a node y, i.e., it reaches y on an 
unvisited incoming edge and y has no unvisited outgoing edge. Since the robot 
is not allowed to traverse edges in the reverse direction, an adversary can always 
force the robot to visit unvisited nodes until it finally gets stuck at a visited node. 

The robot then relocates, using visited edges, to some visited node z with un- 
explored outgoing edges and continues the exploration. The choice of z is the 
only difference between various algorithms and the relocation to z is the only step 
where the robot traverses visited edges. To minimize R we have to minimize the 
total number of edges traversed during all relocations. It turns out that a locally
greedy algorithm that tries to minimize the number of traversed edges during each
relocation is not optimal: it has a lower bound of $2^{\Omega(d)} m$ (see Section 2).

Instead, our algorithm uses a divide-and-conquer approach. The robot explores
a graph with deficiency d by exploring $d^2$ subgraphs with deficiencies $d/2$ each and
uses the same approach recursively on each of the subgraphs. To create subgraphs
with small deficiencies, the robot keeps track of visited nodes that have more vis- 
ited outgoing than visited incoming edges. Intuitively, these nodes are expensive 
because the robot, when exploring new edges, can get stuck there. The relocation 
strategy tries to keep portions of the explored subgraphs "balanced" with respect 
to their expensive nodes. If the robot gets stuck at some node, then it relocates to 
a node z such that "its" portion of the explored subgraph contains the minimum 
number of expensive nodes. 

2 Lower bounds for various algorithms 

In this section we give lower bounds of $2^{\Omega(d)} m$, resp. $d^{\Omega(\log d)} m$, for a
locally greedy, a generalized greedy, a depth-first, and a breadth-first algorithm.
A related problem for which lower bounds have been studied extensively is the
s-t connectivity problem in directed graphs, see [3, 4, 14] and references therein.
Given a directed graph, the problem is to decide whether there exists a path from
a distinguished node s to a distinguished node t. Most of the results are developed
in the JAG model by Cook and Rackoff [10]. The best time-space tradeoffs
currently known [4, 14] only imply a polynomial lower bound on the computation
time if no upper bounds are imposed on the space used by the computation. Given
the current knowledge of the s-t connectivity problem it seems unlikely that one
can prove super-polynomial lower bounds for a general class of graph exploration
algorithms.

In the following let G be a directed, strongly connected graph and let v be a
node of G. Let in(v) and out(v) denote the number of incoming, resp. outgoing
edges of v. Let the balance be $bal(v) = out(v) - in(v)$. For a graph with deficiency
d there exist at most d nodes $s_i$, $1 \le i \le d$, such that $bal(s_i) < 0$. Every node
$s_i$ with $bal(s_i) < 0$ is called a sink. Note that $\sum_{s:\, bal(s) < 0} bal(s) = -d$. We
use the term chain to denote a path. A chain is a sequence of nodes and edges
$x_1, (x_1, x_2), x_2, (x_2, x_3), \ldots, (x_{k-1}, x_k), x_k$ for $k \ge 1$.
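
As a small illustration of these definitions, the following Python sketch computes bal(v) and the deficiency of a graph given as an edge list. The function names and the edge-list representation are assumptions made for this example only; the sketch relies on the identity stated above, namely that the balances of the sinks sum to $-d$.

```python
from collections import Counter


def balances(edges):
    """Return bal(v) = out(v) - in(v) for every node of a directed multigraph.

    `edges` is a list of (tail, head) pairs; parallel edges appear repeatedly.
    """
    bal = Counter()
    for tail, head in edges:
        bal[tail] += 1  # one more outgoing edge at the tail
        bal[head] -= 1  # one more incoming edge at the head
    return dict(bal)


def deficiency(edges):
    """Sum of -bal(s) over all sinks s, i.e. over nodes with bal(s) < 0."""
    return sum(-b for b in balances(edges).values() if b < 0)


# A directed 3-cycle is Eulerian, so its deficiency is 0.
cycle = [("a", "b"), ("b", "c"), ("c", "a")]

# The same cycle with the edge c -> a replaced by d + 1 = 3 parallel edges.
parallel = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "a"), ("c", "a")]
```

On the second graph, node a is the only sink with bal(a) = -2, so the sketch reports deficiency 2.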

Greedy: If stuck at a node y, move to the nearest node z that has new outgoing 
edges. 

Generalized-Greedy: At any time, for each path in the subgraph explored so 
far, define a lexicographic vector as follows. For each edge on the path, determine 
its current cost, which is the number of times the edge was traversed so far. Sort 
these costs in non-increasing order and assign this vector to the path. Whenever 
stuck at a node y, out of all paths to nodes with new outgoing edges traverse the
path whose vector is lexicographically minimum.

Depth-First: If stuck at a node y, move to the most recently discovered node 
z that can be reached and that has new outgoing edges. 

Breadth-First: Let v be the node where the exploration starts initially. If 
stuck at a node y, move to the node z that has the smallest distance from v among 
all nodes with new outgoing edges that can be reached from y. 
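
To make the relocation rules concrete, here is a minimal Python sketch of the Greedy strategy and of the path vector used by Generalized-Greedy. It is an illustration under stated assumptions, not the construction used in the proofs: the graph is an adjacency list whose fixed edge order stands in for the adversary, and the names `greedy_explore`, `path_vector`, and `pick_path` are hypothetical. Vectors of different lengths are compared with Python's list ordering, which is one possible convention.

```python
from collections import deque


def greedy_explore(graph, start):
    """Count the edge traversals of Greedy on a strongly connected graph.

    `graph` maps each node to the list of heads of its outgoing edges.
    """
    used = {v: 0 for v in graph}  # used[v] = number of visited edges leaving v
    current, traversals = start, 0
    while True:
        # Exploration: follow unvisited edges until the robot is stuck.
        while used[current] < len(graph[current]):
            head = graph[current][used[current]]
            used[current] += 1
            traversals += 1
            current = head
        # Relocation: breadth-first search over visited edges only, which
        # finds the nearest node that still has new outgoing edges.
        dist = {current: 0}
        queue = deque([current])
        target = None
        while queue:
            v = queue.popleft()
            if used[v] < len(graph[v]):
                target = v
                break
            for w in graph[v][:used[v]]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        if target is None:
            return traversals  # every reachable edge has been visited
        traversals += dist[target]
        current = target


def path_vector(path_edges, cost):
    """Edge costs along a path, sorted in non-increasing order."""
    return sorted((cost[e] for e in path_edges), reverse=True)


def pick_path(candidate_paths, cost):
    """Return the candidate path whose vector is lexicographically minimum."""
    return min(candidate_paths, key=lambda p: path_vector(p, cost))


cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
parallel = {"a": ["b"], "b": ["c"], "c": ["a", "a", "a"]}
cost = {"e1": 3, "e2": 0, "e3": 2, "e4": 2}
```

On the cycle with three parallel edges from c to a, Greedy started at a has to walk back from a to c twice, so the sketch counts 9 traversals for 5 edges. With the costs above, the path (e3, e4) with vector (2, 2) is preferred over (e1, e2) with vector (3, 0).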

Theorem 1 For Greedy, Depth-First, and Breadth-First and for every d, there
exist graphs of deficiency d that require $2^{\Omega(d)} m$ edge traversals. For
Generalized-Greedy and for every d, there exists a graph of deficiency d that
requires $d^{\Omega(\log d)} m$ edge traversals.

Proof: Greedy: Basically Greedy fails since it is easy to "hide" a subgraph. When- 
ever Greedy discovers this subgraph, the adversary can force it to repeat all the 
work done so far. 

The graph G consists of two parts, (1) a cycle $C_0$ of three edges and nodes v,
$v^1(C_0)$, and $v^2(C_0)$, and (2) a recursively defined problem $P^d$. A problem
$P^\delta$, for any integer $\delta \ge 2$, is a subgraph that has two incoming edges whose
startnodes do not belong to $P^\delta$ but whose endnodes do, and $\delta$ outgoing edges
whose startnodes belong to $P^\delta$ but whose endnodes do not. A problem $P^1$ is
defined in the same way as a problem $P^\delta$, $\delta \ge 2$, except that $P^1$ has only one
incoming edge. In the case of $P^d$, the two incoming edges start at $v^1(C_0)$ and
$v^2(C_0)$, respectively; the d outgoing edges all point to v.

For the description of $P^\delta$ we also need recursively defined problems $Q^\delta$.
These problems are identical to $P^\delta$ except that, for $\delta \ge 2$, $Q^\delta$ has exactly
$\delta$ incoming edges.

A problem $P^\delta$, $\delta = 1, 2$, consists of $\delta$ chains of three edges each. The first
edge of each chain is an incoming edge into $P^\delta$; the last edge of each chain is an
outgoing edge. A problem $Q^\delta$, $\delta = 1, 2$, is the same as $P^\delta$.

We proceed to define $P^\delta$, for $\delta > 2$. One of the incoming edges of $P^\delta$ is the
first edge of a chain $D^\delta$ consisting of three edges, the other incoming edge is the
first edge of a long chain $C^\delta$. For each of these chains $C^\delta$ and $D^\delta$, the last
edge is an outgoing edge of $P^\delta$. If $\delta = 3$, the last interior node of each of the
chains $C^\delta$ and $D^\delta$ has an additional outgoing edge pointing into a problem $P^1$.
If $\delta \ge 4$, (a) the last two interior nodes of $C^\delta$ each have an additional outgoing
edge pointing into a subproblem $P^{\delta-2}$, (b) the last two interior nodes of $D^\delta$
each have an additional outgoing edge pointing into a subproblem $Q^{\delta-2}$. There
are $\delta - 2$ edges leaving $P^{\delta-2}$, exactly $\max\{0, \delta - 4\}$ of which point to nodes
of $Q^{\delta-2}$ such that each node in $Q^{\delta-2}$ that has k more outgoing than incoming
edges, for $k > 0$, receives k incoming edges from $P^{\delta-2}$. The remaining outgoing
edges of $P^{\delta-2}$ point to the interior nodes of $D^\delta$ that have additional outgoing
edges. The problem $Q^{\delta-2}$ has $\delta - 2$ outgoing edges all of which are outgoing
edges of $P^\delta$. The total number of edges in $C^\delta$ is 2 plus the number of edges of
$D^\delta$ plus the total number of edges contained in the subproblem $Q^{\delta-2}$ below
$D^\delta$.

A problem $Q^\delta$, $\delta > 2$, is the same as $P^\delta$ except that the subproblem
$P^{\delta-2}$ is replaced by another $Q^{\delta-2}$ problem. That is, $Q^\delta$ is composed of chains
$C^\delta$, $D^\delta$ and problems $Q_i^{\delta-2}$, $i = 1, 2$. As mentioned before, $Q^\delta$ has exactly
$\delta$ incoming edges.

Greedy is started at node v and first traverses chain $C_0$. Then it explores either
$C^d$ or $D^d$. In either case, afterwards Greedy explores all edges of $Q^{d-2}$ since
$C^d$ is prohibitively long. Thus, $P^{d-2}$ is "hidden" from Greedy. We exploit this in
the analysis: Let $N(\delta)$ be the number of times that Greedy explores edges of a
problem $P^\delta$ or $Q^\delta$, gets stuck at some node and cannot relocate to a suitable
node by using only edges in $P^\delta$ resp. $Q^\delta$. We show that
$N(\delta) \ge 2^{\lceil \delta/2 \rceil - 1}$, i.e., $N(\delta) = 2^{\Omega(\delta)}$. Since the edge leaving v is
traversed every time the algorithm cannot relocate by using only edges in $P^d$, the
bound follows.

Figure 1: The graph for Greedy

A problem $P^\delta$ contains two subproblems $P^{\delta-2}$ and $Q^{\delta-2}$. Note (a) that,
because of chain $D^\delta$, no node in $Q^{\delta-2}$ can reach a node of $P^{\delta-2}$ without
leaving $P^\delta$. Note (b) that $Q^{\delta-2}$ is completely explored when the exploration of
$P^{\delta-2}$ starts and all paths starting in $P^{\delta-2}$ lead through $D^\delta$ or $Q^{\delta-2}$. Thus,
every time Greedy gets stuck in a subproblem $P^{\delta-2}$ or $Q^{\delta-2}$ and has to leave
$P^{\delta-2}$ resp. $Q^{\delta-2}$ in order to resume exploration, it also has to leave $P^\delta$. For
$Q^{\delta-2}$ the statement follows from (a); for $P^{\delta-2}$ it follows from (a) and (b). In
the same way we can argue for a problem $Q^\delta$. Thus, $N(\delta) \ge 2N(\delta - 2)$. Since
$N(\delta) \ge 1$ for $\delta = 1, 2$, we obtain $N(\delta) \ge 2^{\lceil \delta/2 \rceil - 1}$.
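
The recurrence can be checked numerically. The sketch below takes the recurrence $N(\delta) \ge 2N(\delta - 2)$ with base cases $N(1), N(2) \ge 1$ at equality and compares the result with the closed form $2^{\lceil \delta/2 \rceil - 1}$; the function name is hypothetical and the check is only an illustration of the arithmetic, not part of the proof.

```python
def greedy_exits(delta):
    """Smallest value allowed by N(delta) >= 2 * N(delta - 2), N(1) = N(2) = 1."""
    if delta <= 2:
        return 1
    return 2 * greedy_exits(delta - 2)


def closed_form(delta):
    """2 ** (ceil(delta / 2) - 1), written with integer arithmetic."""
    return 2 ** ((delta + 1) // 2 - 1)


# The recurrence and the closed form agree for small deficiencies.
agree = all(greedy_exits(n) == closed_form(n) for n in range(1, 30))
```

Since the exponent grows linearly in $\delta$, the number of forced exits is $2^{\Omega(\delta)}$.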

This implies that the edge e on $C_0$ leaving v is traversed $2^{\Omega(d)}$ times. The
desired bound follows by replacing e by a path consisting of $\Theta(m)$ edges.

Depth-First: We can use the same graph as in the case of the Greedy algorithm.
Depth-First will explore all edges in $Q^{d-2}$ before it starts exploring $P^{d-2}$.

Breadth-First: Again we can use the same graph as in the lower bound for
Greedy. The last two interior nodes of $C^d$ have a larger distance from the initial
node v than all nodes on $D^d$ and in $Q^{d-2}$. Thus $Q^{d-2}$ is finished before
Breadth-First starts exploring $P^{d-2}$.

Generalized-Greedy: The graph used for the lower bound is outlined in Figure 2.
The basic idea in the lower bound construction is as follows. Generalized-Greedy
explores each subgraph $Q_i^\gamma$ and its sibling $R_i^\gamma$ "in parallel". Without loss
of generality we can assume that the last chain traversed in the two subgraphs lies
in $Q_i^\gamma$ and the algorithm continues to explore $Q_{i+1}^\gamma$ and $R_{i+1}^\gamma$. Let
$N(\gamma)$ denote the number of times that the algorithm has to leave $R_i^\gamma$ and
traverse the root. We will show that $N(4\gamma) \ge \gamma N(\gamma)$, which implies that the
root has to be traversed $N(d) \ge d^{\Omega(\log d)}$ times.

To be precise we show the bound for d being a power of 4. The bound for all
values of d follows by "rounding" down to the largest power of 4 smaller than d.

Figure 2: The graph for Generalized-Greedy



The graph G consists of two parts, (1) a cycle $C_0$ with nodes v, $v^1(C_0)$ and
$v^2(C_0)$, and (2) a recursively defined subproblem $P^d$. Problem $P^d$ has two
incoming edges, one starting at $v^1(C_0)$ and one starting at $v^2(C_0)$. It also has d
outgoing edges, all pointing to v. The subproblem $P^d$ is a union of chains C, each
of which consists of three edges, a startnode, an endnode and two interior nodes
$v^1(C)$ and $v^2(C)$. The interior nodes have at most one additional outgoing edge.
We proceed to define $P^\delta$ and the "sibling" graphs $Q^\delta$ and $R^\delta$, for all
$\delta \le d$ that are a power of 4, and then show the lower bound on this graph.

A problem $P^\delta$, $\delta > 1$, is a graph with two incoming edges and exactly $\delta$
outgoing edges. A problem $R^\delta$, $\delta > 1$, consists of $P^\delta$ with $\delta - 2$ additional
incoming edges. The problem $Q^\delta$ consists of $R^\delta$ with two additional incoming
and two additional outgoing edges.

$\delta = 1$: A problem $P^1$ consists of one chain. The incoming edge of $P^1$ is the
first edge of the chain, and the outgoing edge of $P^1$ is the last edge of the chain.
In $P^1$, the interior nodes of the chain have no additional outgoing edges; in $Q^1$
each interior node has one additional incoming and one additional outgoing edge.
Problem $R^1$ is equal to $P^1$.

$\delta = 4$: A problem $P^4$ consists of two subproblems $P_1^1$ and $P_2^1$, and chains
$C_1^1$ and $D_1^1$, whose first interior nodes have one additional outgoing edge. The
additional outgoing edge of $C_1^1$ is the incoming edge of $P_1^1$ and the corresponding
edge of $D_1^1$ is the incoming edge of $P_2^1$. The last edge of $C_1^1$ and of $D_1^1$ and
the outgoing edges of $P_1^1$ and $P_2^1$ are outgoing edges of $P^4$. A problem $R^4$ is
$P^4$ with two additional incoming edges, one at the startnode of $P_1^1$ and one at the
startnode of $P_2^1$. A problem $Q^4$ is $R^4$ with two additional incoming and outgoing
edges; each interior node of $P_1^1$ has an additional incoming and outgoing edge.

Figure 3: The subproblem $P^4$

$\delta = 4^l$, for some $l \ge 2$: Let $\gamma = \delta/4$. It is simpler to describe $Q^\delta$
first. The construction is depicted in Figure 4. Every node has the same indegree
as outdegree, i.e., there are no sinks. Problem $Q^\delta$ consists of subproblems
$Q_i^\gamma$ and $R_i^\gamma$, for $1 \le i \le \gamma$, connected by chains $C_i^\gamma$ and $D_i^\gamma$, for
$1 \le i \le \gamma$, whose interior nodes each have an additional outgoing edge.

The C-chains and Q-subproblems are interleaved as follows. The two edges
leaving the interior nodes of $C_1^\gamma$ point into $Q_1^\gamma$. In general, the edges leaving
the interior nodes of $C_i^\gamma$ point into $Q_i^\gamma$. The same holds for the D-chains and
R-subproblems. The first edge of $C_i^\gamma$ and of $D_i^\gamma$ are incoming edges of $Q^\delta$,
for $i = 1$, and start in $Q_{i-1}^\gamma$, for $1 < i \le \gamma$, on a node of the leftmost
subproblem $Q^1$ contained in $Q_{i-1}^\gamma$. Recall that this problem consists of one
chain with two additional incoming and outgoing edges. One of these outgoing
edges is the first edge of $C_i^\gamma$ and the second outgoing edge is the first edge of
$D_i^\gamma$.

Additionally, the subproblems are connected as follows. Recall that $\gamma$ edges
leave $R_i^\gamma$. For $i = 1$, the edges leaving $R_1^\gamma$ are outgoing edges of $Q^\delta$. For
$1 < i \le \gamma$, two edges leaving $R_i^\gamma$ point to the interior nodes of $D_{i-1}^\gamma$.
Additionally, there are $\gamma - 2$ edges leaving $R_i^\gamma$ and pointing into $R_{i-1}^\gamma$ such
that every node in $R_{i-1}^\gamma$ that has k more outgoing than incoming edges, for
$k > 0$, receives k edges from $R_i^\gamma$. The same holds for $Q_i^\gamma$ with $C_{i-1}^\gamma$. The
problem $Q_\gamma^\gamma$ has $\gamma$ incoming edges which are incoming edges for $Q^\delta$, the
problem $R_\gamma^\gamma$ has $\gamma - 2$ incoming edges which are incoming edges for $Q^\delta$.

There are $4\gamma + 2 = \delta + 2$ outgoing edges in $Q^\delta$: the last edge of $C_i^\gamma$ and the
last edge of $D_i^\gamma$, for $1 \le i \le \gamma$, all edges leaving $R_1^\gamma$, all but two edges
leaving $Q_1^\gamma$ (the other two are the incoming edges of $D_2^\gamma$ and $C_2^\gamma$), and two
edges leaving $Q_\gamma^\gamma$. There are also $\delta + 2$ incoming edges: the first edge of
$C_1^\gamma$ and of $D_1^\gamma$, the edges pointing to the two interior nodes of $C_\gamma^\gamma$ and
$D_\gamma^\gamma$, the $\gamma$ incoming edges of $Q_\gamma^\gamma$, the $\gamma - 2$ incoming edges of
$R_\gamma^\gamma$, and $2\gamma - 2$ incoming edges ending at the startnodes of $C_i^\gamma$ and
$D_i^\gamma$, for $2 \le i \le \gamma$.

A problem $P^\delta$ consists of $2\gamma$ chains $C_i^\gamma$ and $D_i^\gamma$, $1 \le i \le \gamma$, as well
as two subproblems $P_i^\gamma$, $\gamma \le i \le \gamma + 1$, and $2(\gamma - 1)$ subproblems
$Q_i^\gamma$ and $R_i^\gamma$, $1 \le i \le \gamma - 1$. These components are assembled in the same
way as in $Q^\delta$, except that $Q_\gamma^\gamma$ is replaced by $P_{\gamma+1}^\gamma$, and $R_\gamma^\gamma$ is
replaced by $P_\gamma^\gamma$. Problems $P_\gamma^\gamma$ and $P_{\gamma+1}^\gamma$ each have only two incoming
edges from $C_\gamma^\gamma$ and $D_\gamma^\gamma$, respectively.

There are $4\gamma = \delta$ outgoing edges in $P^\delta$: the last edge of $C_i^\gamma$ and the last
edge of $D_i^\gamma$, for $1 \le i \le \gamma$, all but two edges leaving $Q_1^\gamma$ (the other two
are the incoming edges of $D_2^\gamma$ and $C_2^\gamma$), and all edges leaving $R_1^\gamma$. There
are two incoming edges in $P^\delta$. The first edge of $C_1^\gamma$ and of $D_1^\gamma$ are incoming
edges in every problem $P^\delta$. The following $\delta - 2$ nodes are sinks for $P^\delta$: the
two interior nodes of $C_\gamma^\gamma$ and of $D_\gamma^\gamma$, the $2\gamma - 2$ startnodes of $C_i^\gamma$
and $D_i^\gamma$, for $2 \le i \le \gamma$, the $\gamma - 2$ sinks of $P_\gamma^\gamma$ and the $\gamma - 2$
sinks of $P_{\gamma+1}^\gamma$.

A problem $R^\delta$ is a problem $P^\delta$ with an incoming edge into all sinks of $P^\delta$.
Thus there are $\delta$ incoming and $\delta$ outgoing edges.

Figure 4: The subproblems $Q^\delta$ and $P^\delta$

We analyze Generalized-Greedy on G. For simplicity we only discuss the
exploration of a problem $Q^\delta$. The argument for $P^\delta$ and $R^\delta$ is analogous. As
before, let $\gamma = \delta/4$. We show inductively that the symmetric construction of
$Q_i^\gamma$ and $R_i^\gamma$ attached to $C_i^\gamma$ and $D_i^\gamma$ as well as the definition of
Generalized-Greedy imply that $Q_i^\gamma$ and $R_i^\gamma$ are explored symmetrically. That
is, during two consecutive traversals of C (in order to resume exploration in
$Q_i^\gamma$ or $R_i^\gamma$), Generalized-Greedy proceeds once into $Q_i^\gamma$ and once into
$R_i^\gamma$, where C is the chain at which chains $C_i^\gamma$ and $D_i^\gamma$ start. This
obviously holds for $i = 1$. Assume it holds for i and we want to show it for
$i + 1$. Note that $Q_i^\gamma$ and $R_i^\gamma$ differ only in the last chain that
Generalized-Greedy explores in $Q_i^\gamma$, resp. $R_i^\gamma$. Thus, until the traversal of
the earlier of the last chain of $Q_i^\gamma$ and the last chain of $R_i^\gamma$,
Generalized-Greedy does not distinguish $Q_i^\gamma$ from $R_i^\gamma$. Hence we can assume
without loss of generality that Generalized-Greedy traverses first the last chain
of $R_i^\gamma$ and afterwards the last chain of $Q_i^\gamma$. (Think of an adversary "giving"
to Generalized-Greedy first the last chain of $R_i^\gamma$ and then the last chain of
$Q_i^\gamma$.) Then Generalized-Greedy explores $C_{i+1}^\gamma$ and $D_{i+1}^\gamma$ and
afterwards $Q_{i+1}^\gamma$ and $R_{i+1}^\gamma$ symmetrically. Thus, when Generalized-Greedy
explores a subproblem $R_i^\gamma$, $1 \le i \le \gamma$, subproblems $R_j^\gamma$ with
$1 \le j < i$ are already finished.

Whenever Generalized-Greedy gets stuck in $R_i^\gamma$, $1 \le i \le \gamma$, and has to
leave $R_i^\gamma$ in order to resume exploration, it also has to leave the "parent
problem" $Q^\delta$ (or $P^\delta$, $R^\delta$). This is because the chains $D_j^\gamma$,
$1 \le j \le \gamma$, prevent the algorithm from reaching a chain in $Q_j^\gamma$,
$1 \le j \le i$, from where unfinished chains in $Q^\delta$ ($P^\delta$, $R^\delta$) can be
reached. On the way from $R_i^\gamma$ to an outgoing edge of the parent problem,
Generalized-Greedy can traverse problems $R_j^\gamma$, $j < i$. As shown above, these
subproblems are finished, so no further exploration of $R_j^\gamma$ is possible. The
same arguments hold when the algorithm gets stuck in a problem $P_\gamma^\gamma$.

For any $\delta$, $4 \le \delta \le d$, let $N(\delta)$ be the number of times Generalized-Greedy
generates a chain in $P^\delta$ or $R^\delta$, gets stuck and has to leave $P^\delta$ or $R^\delta$ in
order to continue exploration. Then $N(\delta) \ge \gamma N(\gamma) = (\delta/4) N(\delta/4)$. Since
$N(1) \ge 1$, we have $N(d) \ge d^{\Omega(\log d)}$ and hence the edge leaving node v is
traversed $d^{\Omega(\log d)}$ times. ■
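
To see why this recurrence gives a bound of the form $d^{\Omega(\log d)}$, one can unroll it for d a power of 4. The Python sketch below is a worked check rather than part of the proof: it takes the recurrence $N(\delta) \ge (\delta/4) N(\delta/4)$ with $N(1) \ge 1$ at equality, and the function name is hypothetical. For $d = 4^l$ the product telescopes to $4^{l(l-1)/2} = d^{(\log_4 d - 1)/2}$.

```python
def root_traversals(delta):
    """Unroll N(delta) = (delta / 4) * N(delta / 4) with N(1) = 1.

    `delta` is assumed to be a power of 4.
    """
    if delta == 1:
        return 1
    gamma = delta // 4
    return gamma * root_traversals(gamma)


def closed_form(l):
    """4 ** (l * (l - 1) / 2), the value of N(4 ** l) after telescoping."""
    return 4 ** (l * (l - 1) // 2)


# Compare the recursion with the closed form for d = 4^0, ..., 4^7.
agree = all(root_traversals(4 ** l) == closed_form(l) for l in range(8))
```

The exponent $(\log_4 d - 1)/2$ grows like $\log d$, which is the claimed $d^{\Omega(\log d)}$ behaviour.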



3 The Balance algorithm 
3.1 The algorithm 

We present an algorithm that explores an unknown, strongly connected graph with 
deficiency d, without knowing d in advance. First we give some definitions. At 
the start of the algorithm, all edges are unvisited or new. An edge becomes visited 
whenever the robot traverses it. A node is finished whenever all its outgoing edges 
are visited. The robot is stuck at a node y if the robot enters a finished node y on 
an unvisited edge. A sink is discovered whenever the robot gets stuck at the sink 
for the first time. We assume that whenever the robot discovers a new sink, the
subgraph of explored edges is strongly connected. This does not hold in general,
but by properly restarting the algorithm at most d times the problem can be reduced 
to the case described here. Details are given in the Appendix. 

Assume the algorithm knew the d missing edges $(s_1, t_1), (s_2, t_2), \ldots, (s_d, t_d)$
and a path from each $s_i$ to $t_i$. Then a modified version of the Eulerian algorithm
could be executed: Whenever the original Eulerian algorithm traverses an edge
$(s_i, t_i)$, the modified Eulerian algorithm traverses the corresponding path from
$s_i$ to $t_i$. Obviously, the modified algorithm traverses each edge at most $2d + 2$
times. Thus, the problem is to find the missing edges and corresponding paths.

Our algorithm tries to find the missing edges by maintaining d edge-disjoint
chains such that the endnode of chain i is $s_i$ and the startnode of chain i is our
current guess of $t_i$. As the algorithm progresses, paths can be appended at the start
of each chain. At termination, the startnode of chain i is indeed $t_i$. To mark chain
i, all edges on chain i are colored with color i.

The algorithm consists of two phases. 

Phase 1: Run the algorithm of [12] for Eulerian graphs. Since G is not
Eulerian, the robot will get stuck at a sink s. At this point stop the Eulerian graph
algorithm and go to Phase 2. The part of the graph explored so far contains a cycle
$C_0$ containing s [12]. We assume that at the end of Phase 1 all visited nodes and
edges not belonging to $C_0$ are marked again as unvisited.

Phase 2: Phase 2 consists of subphases. During each subphase the robot visits
a current node x of a current chain C and makes progress towards finishing the
nodes of C. The current node of the first subphase is s, its current chain is $C_0$. The
current node and current chain of subphase j depend on the outcome of subphase
$j - 1$.

A chain can be in one of three states: fresh, in progress, or finished. A chain C 
is finished when all its nodes are finished; C is in progress in subphase j if C was a 
current chain in a subphase j' < j and C is not yet finished; C is fresh if its edges 
are explored, but C is not yet in progress. 

At any time, up to d + 1 chains in progress and up to d fresh chains
can exist. The invariant that there are always at most d + 1 chains in progress
is convenient but not essential in the analysis of the algorithm. The invariant that
there always exist at most d fresh chains is crucial. Every startnode of a fresh chain
has more visited outgoing than visited incoming edges and, thus, the robot can get
stuck there. In the analysis we require that there always exist at most d such nodes.

The algorithm marks the current guess for $t_i$ with a token $\tau_i$, for $1 \le i \le d$.
In fact, every startnode of a fresh chain represents the current guess for some $t_i$,
$1 \le i \le d$, and thus has a token $\tau_i$. To simplify the description of the relocation
process, each token is also assigned an owner, which is a chain that contains the
node on which the token is placed. Note that a node can be the current guess for
more than one node $t_i$ and, thus, have more than one token.

From a high-level point of view, at any time, the subgraph explored so far is
partitioned into chains, namely Co and the chains generated in Phase 2. During the 
actual exploration in the subphases, the robot travels between chains. While doing 
so, it generates or extends fresh chains, which will be taken into progress later, and 
finishes the chains currently in progress. 

We give the details of a subphase. First, the algorithm tests if x has an unvisited 
outgoing edge. 

1. If x does not have an unvisited outgoing edge and x is not the endnode of C,
then the next node of C becomes the current node and a new subphase is started.

2. If x has no unvisited outgoing edge and x is the endnode of C, procedure
Relocate is called to decide which chain becomes the current chain and to
move the robot to the startnode z of this chain. Node z becomes the current
node.

3. If x has unvisited outgoing edges, the robot repeatedly explores unvisited edges
until it gets stuck at a node y. Let P be the path traversed. We distinguish four
cases:

Case 1: $y = x$

Cut C at x and add P to C. See Figure 5. The robot returns to x and the next
subphase has the same current node and current chain.




Figure 5: Case 1 

Case 2: $y \ne x$, y has a token $\tau_i$ and is the startnode of a fresh chain D (see
Figure 6)

Attach P to the start of D to create a longer fresh chain, and move the token from
y to x. The current chain C becomes the owner of the token, the previous owner
becomes the current chain, and y becomes the current node.

Figure 6: Case 2

Case 3: $y \ne x$, y has a token $\tau_i$ but is not the startnode of a fresh chain.

This is the same as Case 2 except that no fresh chain starts at y. The algorithm
creates a new fresh chain of color i consisting of P. It moves the token from y
to x and C becomes the owner of the token. The previous owner of the token
becomes the current chain and y becomes the current node.

Case 4: $y \ne x$ and y does not have a token.

In this case $bal(y) < 0$. If $bal(y) = -k$, then this case occurs k times for y.
Let i be the number of existing tokens. The algorithm puts a new token $\tau_{i+1}$ on
x with owner C, creates a fresh chain of color i + 1 consisting of P (the first
chain with color i + 1), and moves the robot back to s. The initial chain $C_0$
becomes the current chain, and s becomes the current node.
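
The case distinction can be summarized in a few lines of Python. This is only a sketch of the dispatch logic, under the assumption that tokens are stored per node and that the startnodes of fresh chains are kept in a set; the function name and both data structures are hypothetical and are not taken from the algorithm's description.

```python
def classify_stuck(x, y, tokens, fresh_starts):
    """Return which of the four cases applies when the robot, having left
    the current node x on new edges, gets stuck at node y.

    tokens:       dict mapping a node to the list of token indices placed on it
    fresh_starts: set of nodes at which a fresh chain currently starts
    """
    if y == x:
        return 1  # the new path P is a cycle through x and is inserted into C
    if tokens.get(y):
        # y carries a token: extend the fresh chain starting at y if there
        # is one (Case 2), otherwise create a new fresh chain (Case 3).
        return 2 if y in fresh_starts else 3
    return 4  # y has no token: a new token is created and placed on x


tokens = {"u": [1], "w": [2]}
fresh_starts = {"u"}
```

With this sample state, getting stuck at u falls under Case 2, at w under Case 3, and at any node without a token (other than x itself) under Case 4.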

This leads to the algorithm given in Figure 7. We use x to denote the current
node, C to denote the current chain, k the number of tokens used, and j the highest
index of a chain. Lines 4-17 of the code correspond to item 3 above. Lines 6 and 7
correspond to Case 1, lines 8-13 correspond to Cases 2 and 3, and lines 14-16 to
Case 4. Lines 18 and 19 implement item 1 and item 2, respectively.

Additionally, the algorithm maintains a tree T such that each chain C corresponds
to a node v(C) of T, and v(C') is a child of v(C) if the last subpath appended
to C' was explored while C was the current chain. Conversely, we use C(v)
to denote the chain represented by node v. We use $T_v$ to denote the subtree of T
rooted at v and say C is contained in $T_v$ if v(C) lies in $T_v$. We also say a token
$\tau$ or an edge e is contained in $T_v$ if owner($\tau$), respectively the chain of e, is
contained in $T_v$. If all chains in $T_v$ are finished, we say that $T_v$ is finished. To
represent T, the algorithm assigns a parent to each chain.
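
One way to picture this bookkeeping is the following Python sketch, which stores a parent pointer and a child list per chain and tests whether a subtree is finished. The class `Chain`, its fields, and the helper names are assumptions made for illustration; the text only states that each chain is assigned a parent.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity comparison avoids recursing through parent links
class Chain:
    name: str
    finished: bool = False
    parent: Optional["Chain"] = None
    children: List["Chain"] = field(default_factory=list)


def attach(parent: Chain, child: Chain) -> None:
    """Record that v(child) is a child of v(parent) in the tree T."""
    child.parent = parent
    parent.children.append(child)


def subtree_finished(chain: Chain) -> bool:
    """True if every chain in the subtree rooted at v(chain) is finished."""
    return chain.finished and all(subtree_finished(c) for c in chain.children)


c0 = Chain("C0", finished=True)
c1 = Chain("C1", finished=True)
c2 = Chain("C2", finished=False)
attach(c0, c1)
attach(c1, c2)
```

In this example the subtree rooted at C2 is unfinished, and therefore so are the subtrees rooted at C1 and C0, even though those two chains are themselves finished.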

To relocate, the robot needs to be able to move on explored edges from the endpoint
of a chain C to its startnode. This is always possible, since at the beginning
of each subphase the explored edges form a strongly connected graph. To avoid
traversing an edge too often for this purpose, we define for each chain C a path
closure(C) connecting the endnode of C with the startnode of C such that an edge
belongs to closure(C) for at most $d^{O(\log d)}$ chains C. Finally, we will show that
closure(C) is traversed at most $O(d^2)$ times.

A path Q is called a C-completion if it connects the endnode of a chain C with
the startnode of C. A path Q in the graph is called i-uniform if it is a concatenation
of chains of color i. Let u be a node of T. A path Q in the graph is $T_u$-homogeneous
if every maximal subpath R of Q that does not belong to $T_u$ satisfies: (a) R is
i-uniform for some color i; (b) the edge of Q preceding R is the last edge of a chain
of color i; and (c) the edge of Q after R is the first edge of a chain of color i.

Algorithm Balance

 1. j := 0; k := 1; x := s; C := $C_0$;
 2. repeat
 3.   while C is unfinished do
 4.     while there is a new outgoing edge at x do
 5.       Traverse new edges starting at x until stuck at a node y. Call this path P.
 6.       if y = x then
 7.         Insert P into C;
 8.       else if y has a token $\tau_i$ then
 9.         if there is a chain D of color i starting in y and D is fresh then
10.           C' := owner($\tau_i$); Concatenate P with D;
11.         else
12.           j := j + 1; $C_j$ := chain that consists of P;
13.         Place $\tau_i$ on x; owner($\tau_i$) := C; x := y; C := C';
14.       else (* y $\ne$ x and y has no token *)
15.         j := j + 1; $C_j$ := chain that consists of P;
16.         k := k + 1; Place token $\tau_k$ on x; owner($\tau_k$) := C; x := s; C := $C_0$;
17.       Move robot to x;
18.     Move robot to first unfinished node z that appears on C after its startnode;
        x := z;
19.   C := Relocate(C); x := startnode of C;
20. until C = empty_chain.

Figure 7: The Balance algorithm

We try to choose closure(C) to be "as local to C" as possible: Let $S(C)$ be the set of explored edges when C becomes the current chain for the first time. Given $S(C)$, $a(C)$ is the lowest ancestor of $v(C)$ in T such that a $T_{a(C)}$-homogeneous completion of C exists in $S(C)$. Note that $a(C)$ is well-defined since each chain has a $T_{v(C_0)}$-homogeneous completion. The path closure(C) is an arbitrary $T_{a(C)}$-homogeneous completion of C using only edges of $S(C)$. The algorithm can compute closure(C) whenever C becomes the current chain for the first time without moving the robot.

We describe the Relocation procedure, see Figure 8. In the relocation step, the robot repeatedly moves from the current chain to its parent until it reaches a chain C such that $T_{v(C)}$ is unfinished. To move from a chain X to its parent X', the robot proceeds along X to the endnode of X and traverses closure(X) to the startnode of X, which belongs to X'. When reaching C, the robot repeatedly moves from the startnode of the current chain X to the startnode of one of its children until it reaches the startnode of an unfinished chain. It chooses the child X' of X such that among all subtrees rooted at children of X and containing unfinished chains, $T_{v(X')}$ has the minimum number of tokens.

Procedure Relocate(C)

1.  if all chains are finished then return(empty_chain)
2.  else Move robot to the startnode of C along closure(C);
3.  while C ≠ C_0 and T_{v(C)} is finished do
4.    Move robot to the startnode of parent(C) along closure(parent(C));
5.    C := parent(C);
6.  while C is finished do
7.    Let C_1, C_2, ..., C_l be the chains with parent(C_k) = C, 1 ≤ k ≤ l. Let C_k be the chain such that T_{v(C_k)} contains the smallest number of tokens among all T_{v(C_1)}, ..., T_{v(C_l)} having unfinished chains;
8.    C := C_k; x := startnode of C;
9.    Move robot to x;
10. if C is not in progress then
11.   Compute closure(C);
12. return(C)

Figure 8: The Relocation procedure
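The descent in lines 6-9 of Relocate can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Chain` record and the helpers `subtree_tokens`, `subtree_finished` and `descend` are hypothetical names, the robot's physical moves are omitted, and the sketch assumes that the subtree rooted at the starting chain still contains an unfinished chain, as lines 3-5 of Relocate ensure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chain:
    """Hypothetical record for one chain C, i.e. one node v(C) of the tree T."""
    name: str
    finished: bool = False
    tokens: int = 0  # number of tokens owned by this chain
    children: List["Chain"] = field(default_factory=list)


def subtree_tokens(c: Chain) -> int:
    """Number of tokens contained in the subtree rooted at v(c)."""
    return c.tokens + sum(subtree_tokens(ch) for ch in c.children)


def subtree_finished(c: Chain) -> bool:
    """A subtree is finished if all of its chains are finished."""
    return c.finished and all(subtree_finished(ch) for ch in c.children)


def descend(c: Chain) -> Chain:
    """Walk down from a finished chain to an unfinished one.

    At each step, pick the child whose subtree still has unfinished chains
    and contains the fewest tokens (ties are broken by child order here).
    """
    while c.finished:
        candidates = [ch for ch in c.children if not subtree_finished(ch)]
        c = min(candidates, key=subtree_tokens)
    return c
```

For example, if a finished root has a child A whose subtree holds 2 tokens, an unfinished child B holding 1 token, and a finished leaf D, then `descend` returns B: D is skipped because its subtree is finished, and B's subtree holds fewer tokens than A's.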

3.2 The analysis of the algorithm 
3.2.1 Correctness 

Since the graph is strongly connected, all nodes of the graph must be visited during the execution of the algorithm. When the algorithm terminates, all visited nodes are finished. Thus, all edges must be explored. We show next that each operation and each move of the robot are well-defined. Proposition 1 shows that if a chain of color i is fresh, then $\tau_i$ lies at the startnode of the chain. Thus, in line 10, token $\tau_i$ lies on y. By assumption there exists a path from any finished node to s. Thus, the move in line 17 is well-defined. In line 18, the robot moves to the next unfinished node of the current chain C. It would be possible to walk along closure(C), but the proof of Lemma 4 shows later that closure(C) is not needed.

3.2.2 Fundamental properties of the algorithm

Lemma 1 At most d tokens are introduced during the execution of the Balance algorithm.

Proof: We say that the algorithm first introduces the token $\tau_k$ at y in line 16.

Let $\mathrm{in}_v(v)$ and $\mathrm{out}_v(v)$ denote the number of visited incoming and visited outgoing edges of v, respectively. Let $t(v)$ be the total number of tokens introduced on node v in line 16. We show inductively that $\max\{\mathrm{in}_v(v) - \mathrm{out}_v(v), 0\} = t(v)$. Since at termination $\mathrm{in}_v(v) = \mathrm{in}(v)$ and $\mathrm{out}_v(v) = \mathrm{out}(v)$, it follows that $-\mathrm{bal}(v) \ge t(v)$ if $\mathrm{bal}(v) < 0$ and $t(v) = 0$ otherwise. Thus, $d = -\sum_{v \text{ with } \mathrm{bal}(v) < 0} \mathrm{bal}(v) \ge \sum_v t(v)$.

The claim $\max\{\mathrm{in}_v(v) - \mathrm{out}_v(v), 0\} = t(v)$ holds initially. Let P be the newly explored path when the first token is placed on v, i.e. when the algorithm gets stuck at v for the first time. Before P enters v, $\mathrm{in}_v(v) = \mathrm{out}_v(v)$. Traversing P increments $\mathrm{in}_v(v)$ by 1 and sets $\mathrm{in}_v(v) - \mathrm{out}_v(v) = 1$. Thus, the claim holds. Let P be the newly explored path when token i is placed on v. It follows inductively that $\mathrm{in}_v(v) - \mathrm{out}_v(v) = i - 1$ before P enters v and traversing P increments the value by 1 as before. ■
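The quantity d that bounds the number of tokens can be computed directly from an edge list. The sketch below assumes, consistently with the proof above, that $\mathrm{bal}(v) = \mathrm{out}(v) - \mathrm{in}(v)$, so that d is the total surplus of incoming over outgoing edges; the function name `deficiency` and the edge-list representation are illustrative choices, not part of the paper.

```python
from collections import Counter


def deficiency(edges):
    """Sum of max(in(v) - out(v), 0) over all nodes of a directed graph.

    `edges` is an iterable of (tail, head) pairs. Under the assumption
    bal(v) = out(v) - in(v), this equals -sum of bal(v) over nodes with
    bal(v) < 0, i.e. the deficiency d.
    """
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    nodes = set(indeg) | set(outdeg)
    return sum(max(indeg[v] - outdeg[v], 0) for v in nodes)
```

A directed cycle is Eulerian and has deficiency 0, whereas the graph with edges 0→1, 1→0, 0→2, 2→0, 1→2 has deficiency 1, because node 2 has two incoming edges and one outgoing edge.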

We prove next some invariants. 

Proposition 1

1. For every chain C that is in progress or finished, parent(C) is finished.

2. Let C be a chain of color i, $1 \le i \le d$. (a) If C is fresh, C does not own a token, $\tau_i$ is located at the startnode of C, and parent(C) = owner($\tau_i$). (b) If C is in progress and not the current chain, then C is the owner of some token $\tau$.

3. Every chain C is the parent of at most d chains.

Proof: Part 1. Procedure Relocate ensures that parent(C) is finished before C is taken into progress.

Part 2a. When C is first created in line 12 or 15 of Balance, $\tau_i$ is placed on the startnode of C. Whenever the robot gets stuck at the current startnode of C and removes $\tau_i$, chain C is extended by a path P because C is not in progress. Token $\tau_i$ is placed on the new startnode of C. Lines 13 and 16 ensure that the parent of C is always the owner of $\tau_i$.

Part 2b. We show that whenever C is the current chain and Balance leaves C to continue work on another chain, C becomes the owner of a token. Chain C is unfinished. Thus, if C is the current chain, Balance can only leave C to continue work on another chain during lines 5-17 of the algorithm. In this situation, Balance places a token on a node of C and C becomes the owner of that token.

Part 3. Chain C can become the parent of other chains while C is in progress and unfinished. During this time, every chain C' with parent(C') = C is unfinished and not in progress, see Part 1. By Part 2a, the startnode of such a chain C' holds a token and C is the owner of that token. Since there are only d tokens, the proposition follows. ■

The next lemma shows that our algorithm always balances the number of tokens contained in neighboring subtrees of T. For a subtree $T_v$ of T, let the weight $w(T_v)$ be the number of tokens contained in $T_v$. Let $\mathrm{active}(T_v) = 1$ if the current chain is in $T_v$; otherwise let $\mathrm{active}(T_v) = 0$.

Lemma 2 Let $u, v \in T$ be siblings in T such that $T_u$ and $T_v$ contain unfinished chains. Then $|w(T_u) + \mathrm{active}(T_u) - w(T_v) - \mathrm{active}(T_v)| \le 1$.

Proof: Let $\mathrm{active}(C) = 1$ iff C is the current chain, and let $\mathrm{active}(C) = 0$ otherwise. Let $\mathrm{token}(C)$ be the number of tokens owned by C, and let $g(C) = \mathrm{token}(C) + \mathrm{active}(C)$. Finally, let $g(v) = \sum_{C:\, v(C) \in T_v} g(C) = w(T_v) + \mathrm{active}(T_v)$. We show by induction on the steps of the algorithm that $|g(u) - g(v)| \le 1$.

The claim holds initially. For a subtree $T_v$ of T, the values $w(T_v)$ and $\mathrm{active}(T_v)$ only change in lines 13, 16, and 19 of Balance and in lines 4 and 9 of procedure Relocate. Additionally, T changes in lines 10, 12, and 15.

Note first that changes in T do not affect the invariant: Whenever T changes, $v(C)$ receives a new child and C is not yet finished (or the algorithm has not yet determined that C is finished). Thus, the children of C are not yet in progress, i.e. they do not own any tokens by Proposition 1. Thus, the claim holds for any pair of children of $v(C)$.

We consider next all changes to $w(T_v)$ and $\mathrm{active}(T_v)$.

Line 13: Let C be the current chain before the execution of line 13. Note that $\mathrm{token}(C)$ increases by 1, $\mathrm{active}(C)$ becomes 0, $\mathrm{token}(C')$ decreases by 1, and $\mathrm{active}(C')$ becomes 1. Thus, $g(C)$ and $g(C')$, and, hence, $g(v)$ is unchanged for every node $v \in T$.

Line 16: Note that (i) $g(C)$ is unchanged by the same argument as for line 13, (ii) $g(C')$ is unchanged, since $\mathrm{token}(C')$ and $\mathrm{active}(C')$ are unchanged, and (iii) $g(C_0)$ is increased by 1. Since $C_0$ only contributes to $g(v(C_0))$ and $v(C_0)$ is the root of T, the claim holds.

Line 19 of Balance / Lines 4 and 9 of Relocate: Let C be the current chain before the execution of line 3 or 7 and let C' be the current chain afterwards. In line 3, the claim does not apply to $T_{v(C)}$, since $T_{v(C)}$ is finished. Thus, we are left with line 7. Note that $\mathrm{active}(C)$ drops to 0 and $\mathrm{active}(C')$ increases to 1. Thus, for every node v such that $T_v$ contains either both the parent and its child or neither the parent nor its child, $g(v)$ is unchanged. The only remaining subtree is $T_{v(C')}$. Before the execution of line 7, for any sibling C'' of C', $w(T_{v(C')}) \le w(T_{v(C'')}) \le w(T_{v(C')}) + 1$. Since $\mathrm{active}(C'') = 0$, $|w(T_{v(C')}) - w(T_{v(C'')}) + \mathrm{active}(C') - \mathrm{active}(C'')| \le 1$. ■

Lemma 3 Let C be a chain of color i, $1 \le i \le d$, and, at the time when C is taken in progress, let $u \in T$ be the closest ancestor of $v(C)$ that satisfies the following condition. The path from u to $v(C)$ in T contains d nodes $u_1, u_2, \ldots, u_d$ such that each $u_j$ with $1 \le j \le d$ has a child $v_j$ with

(a) $T_{v_j}$ contains a node of color i; and (b) $v(C) \notin T_{v_j}$.

If there is no such ancestor u, then let u be $v(C_0)$. Then there exists a $T_u$-homogeneous C-completion.

Proof: By assumption, the graph of explored edges is strongly connected, which implies that there exists a $T_{v(C_0)}$-homogeneous C-completion. Suppose that there are d nodes $u_1, \ldots, u_d$ satisfying (a) and (b). For $j = 1, \ldots, d$, let $C_{u_j}$ be the chain corresponding to $u_j$. If one of the nodes $u_1, \ldots, u_d$, say $u_k$, is of color i, then there is the following $T_{u_k}$-homogeneous C-completion: Follow edges of color i until you reach the startnode of $C_{u_k}$, then walk "down" in $T_{u_k}$ along ancestors of C to the startnode of C.

Thus, we are left with the case that none of the nodes $u_1, \ldots, u_d$ has color i. For $j = 1, \ldots, d$, let $C_{j,1} \in T_{v_j}$ be a chain of color i such that no ancestor of $C_{j,1}$ contained in $T_{v_j}$ has color i. Let $C_{j,2}, \ldots, C_{j,l(j)}$ be the ancestors of $C_{j,1}$ in $T_{u_j}$. More precisely, for $k = 1, \ldots, l(j) - 1$, $C_{j,k+1} = \mathrm{parent}(C_{j,k})$ and $C_{j,l(j)} = C_{u_j}$ is the chain corresponding to $u_j$.

Following the edges of color i gives a $T_u$-homogeneous path from C to every chain $C_{j,1}$ for $1 \le j \le d$. We want to show that there exists a $T_u$-homogeneous path to a chain $C_{j,l(j)}$. We consider the following game on a $d \times \max_j l(j)$ grid, where for $1 \le j \le d$, square $(j, k)$ has the color of $C_{j,k}$ for $1 \le k \le l(j)$ and no color for $k > l(j)$. Thus, all squares $(j, 1)$ have color i and no other squares have color i. Initially all squares $(j, 1)$ are checked, all other squares are unchecked. A square is checked if the robot can move to the startnode of the corresponding chain on a $T_u$-homogeneous path. The rules of the game are: (Note that the startnode of $C_{j,k-1}$ belongs to $C_{j,k}$.)

• A square $(j, k)$ of color i' gets checked whenever there exists a square $(j', k')$ of color i' such that square $(j', k' - 1)$ is checked and there exists a path of color-i' edges from the endnode of $C_{j',k'}$ to the startnode of $C_{j,k}$.

• The game terminates when one of the squares $(j, l(j))$ is checked or when no more squares can be checked.

We will show that one of the squares $(j, l(j))$ can be checked. This shows that there is a $T_u$-homogeneous path from C to $C_{j,l(j)}$. Since $u_j$ is an ancestor of $v(C)$, the same argument as above shows that there exists a $T_u$-homogeneous C-completion.

We employ the pigeon-hole principle: Initially, there are d checked squares $(j, 1)$ for $1 \le j \le d$ and each square $(j, 2)$ has a color $i' \ne i$. Since there are at most $d - 1$ other colors, there must be two squares $(s, 2)$ and $(t, 2)$ with the same color i'. Since the edges of color i' form a chain, there is either a path from $C_{s,2}$ to $C_{t,2}$ or vice versa. Thus, one of the two squares can be checked. Inductively, there are d checked squares $(j, k(j))$ such that $(j, k(j) + 1)$ is unchecked. None of the squares $(j, k(j) + 1)$ has color i and thus, there must be two squares $(j, k(j) + 1)$ with the same color, which leads to checking one of the two squares. The game continues until one of the squares $(j, l(j))$ has been checked. ■

3.2.3 Counting the number of edge traversals 

Lemma 4 Each edge is traversed at most d times during executions of line 17 and at most $2d + 2$ times during executions of line 18 of the Balance algorithm.

Proof: Let e be an arbitrary edge and let C be the chain e belongs to. Every time 
e is traversed during an execution of line 17, a new token is placed on the graph. 
Since a total of d tokens are placed, the first statement of the lemma follows. 

Next we analyze executions of line 18. Let x and y be the tail and the head of e, i.e. $e = (x, y)$. Let $C^1$ be the portion of C that consists of the path from the startnode of C to x. Similarly, let $C^2$ be the path from y to the endnode of C.

Note that in line 18, edge e could only be traversed while nodes on $C^1$ are unfinished if the robot gets stuck at a node y on $C^1$, y having a token, and has to move to an unfinished node z on $C^1$ that lies before y. Since y holds a token, with C being the owner, y must have been the current node in a subphase when C was current chain. However, the node selection rule in line 18 ensures that this is impossible because z is unfinished. This also implies that in line 18, the robot
can always reach the first unfinished node on C by following C, without traversing 
closure(C). 

Thus, e is traversed for the first time in line 18 when all nodes on $C^1$ are finished and the robot moves to the next unfinished node on $C^2$. The edge e can be traversed again (a) if the robot gets stuck at a node on $C^1$ and moves to the next unfinished node of C, or (b) if the robot traverses C from its startnode, since procedure Relocate returned chain C. Every time case (a) occurs, a token is removed from C, and this token cannot be placed again on C. Since there are only d tokens, e can be traversed at most d more times in case (a) after it was traversed the first time in that line. Every time case (b) occurs, $\mathrm{token}(C) + \mathrm{active}(C)$ increases by 1, while no other step of the algorithm can decrease this value as long as C is unfinished. Thus, case (b) occurs at most $d + 1$ times. ■

Thus, it only remains to bound how often an edge is traversed in Relocate. A chain C' is dependent on a chain C if $C' \in T_{v(C)}$ and closure(C') is not $T_u$-homogeneous for any proper descendant u of $v(C)$.

Lemma 5 For every chain C, there exist at most $d^{2\log d+1}$ chains $C' \in T_{v(C)}$ that are dependent on C.

Proof: Let $n_i(C)$ be the total number of chains of color i dependent on C. For a color i, $1 \le i \le d$, and an integer $\delta$, $1 \le \delta \le d$, let

$N_i(\delta) = \max_C \{ n_i(C) ;\ T_{v(C)}$ contains at most $\delta$ of the d tokens whenever $\mathrm{active}(T_{v(C)}) = 1 \}$.

We will show that for any $\delta$, $1 \le \delta \le d$, and any color i, (1) $N_i(\delta) \le d^2 N_i(\lfloor \delta/2 \rfloor)$ and (2) $N_i(1) = 1$. This implies $N_i(d) \le d^{2\log d}$. Since $\sum_{i=1}^{d} N_i(d) \le d \cdot d^{2\log d}$, the lemma follows.

To prove (1), fix a color i and an integer $\delta$. Consider a subtree $T_{v(C)}$ that contains at most $\delta$ tokens when $\mathrm{active}(T_{v(C)}) = 1$. Out of all chains dependent on C, let C' be the chain whose closure is computed last. We show that when the algorithm computes closure(C'), then the number of chains of color i that are already dependent on C is at most $d(d-1) N_i(\lfloor \delta/2 \rfloor)$. Thus, $n_i(C) \le d(d-1) N_i(\lfloor \delta/2 \rfloor) + 1 \le d^2 N_i(\lfloor \delta/2 \rfloor)$.

Let $u_1, u_2, \ldots, u_l$ be the sequence of nodes (from lowest to highest) on the path from $v(C')$ to $v(C)$ such that every node $u_j$, $j = 1, 2, \ldots, l$, has a child $v_j$ with (a) $T_{v_j}$ contains a node of color i, and (b) $v(C') \notin T_{v_j}$. By Lemma 3, $l \le d$. Suppose that node $u_j$, $1 \le j \le d$, has $c(j)$ children $v_{j,1}, v_{j,2}, \ldots, v_{j,c(j)}$ with $v(C') \in T_{v_{j,1}}$. By condition (b), $2 \le c(j) \le d$.

For fixed j and $k \ge 2$, we have to show: Up to the time when closure(C') is computed, whenever $\mathrm{active}(T_{v_{j,k}}) = 1$, then $w(T_{v_{j,k}}) \le \lfloor \delta/2 \rfloor$. Consider the point in time when closure(C') is computed. Since $T_{v_{j,1}}$ contains C', $T_{v_{j,1}}$ is unfinished. By Lemma 2, Balance distributes the tokens contained in $T_{u_j}$ evenly among the subtrees $T_{v_{j,1}}, T_{v_{j,2}}, \ldots, T_{v_{j,c(j)}}$ that contain unfinished chains. Thus, for each unfinished $T_{v_{j,k}}$ with $k \ge 2$, $w(T_{v_{j,k}})$ was up to now at most $\lfloor \delta/2 \rfloor$ whenever $\mathrm{active}(T_{v_{j,k}}) = 1$. For each finished $T_{v_{j,k}}$, consider the last point of time when an unfinished chain of $T_{v_{j,k}}$ becomes the current chain. Since $v_{j,1}$ exists, $T_{v_{j,1}}$ is unfinished and, by Lemma 2, $w(T_{v_{j,k}})$ is up to this point in time at most $\lfloor \delta/2 \rfloor$ whenever $\mathrm{active}(T_{v_{j,k}}) = 1$. We conclude that up to the time when closure(C') is computed, $T_{v_{j,k}}$ contains at most $N_i(\lfloor \delta/2 \rfloor)$ chains of color i that can be dependent on the chain corresponding to $v_{j,k}$, and, thus, can be dependent on C. Summing up, we obtain that $T_{v(C)}$ contains at most

$$\sum_{j=1}^{d} \sum_{k=2}^{c(j)} N_i(\lfloor \delta/2 \rfloor) \le d(d-1) N_i(\lfloor \delta/2 \rfloor)$$

chains of color i that can be dependent on C.

Finally we show that $N_i(1) = 1$. If a subtree $T_{v(C)}$ contains at most one token whenever $\mathrm{active}(T_{v(C)}) = 1$, then each node in $T_{v(C)}$ has only one child, by Proposition 1. Since $T_{v(C)}$ never branches, it can contain at most one chain of color i that is dependent on C. ■
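The step from the recurrence to the closed form can be checked by unrolling it. The sketch below treats the two inequalities as equalities, $N(\delta) = d^2 N(\lfloor \delta/2 \rfloor)$ with $N(1) = 1$, and assumes that the logarithm is taken to base 2; the function name `unrolled_bound` is illustrative.

```python
import math


def unrolled_bound(delta: int, d: int) -> int:
    """Unroll N(delta) = d^2 * N(floor(delta / 2)) with N(1) = 1.

    Each halving of delta contributes one factor d^2, and delta reaches 1
    after floor(log2(delta)) halvings.
    """
    bound = 1
    while delta > 1:
        bound *= d * d
        delta //= 2
    return bound
```

Starting from $\delta = d$, the loop runs $\lfloor \log_2 d \rfloor$ times, so the result is $d^{2\lfloor \log_2 d \rfloor} \le d^{2\log d}$; for instance, `unrolled_bound(8, 8)` equals `8 ** 6`. Summing over the d colors gives the factor $d \cdot d^{2\log d}$ used in the lemma.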

Lemma 6 For every chain C, there exist at most $d^{2\log d+1}$ chains $C' \in T_{v(C)}$ such that closure(C') uses edges of C.

Proof: Let C be an arbitrary chain and let $v \in T$ be the node corresponding to C. We show that if a chain $C' \in T_{v(C)}$ is not dependent on C, then closure(C') does not use edges of C. Lemma 6 then follows immediately from Lemma 5.

If a chain $C' \in T_{v(C)}$ is not dependent on C, then the path closure(C') is $T_u$-homogeneous for a descendant u of v. Suppose that a $T_u$-homogeneous path P would use edges of C. Let i be the color of C. Chain C does not belong to $T_u$. Thus, after P has visited C, it may only traverse chains of color i until it reaches again a chain of color i that belongs to $T_u$. Note that all chains of color i that are reachable from C via edges of color i must have been generated earlier than C. However, all chains in $T_u$ were generated later than C. We conclude that a $T_u$-homogeneous path cannot use edges of C. ■

Lemma 7 For every chain C, there exist at most $(d+2)d^{2\log d+1}$ chains $C' \notin T_{v(C)}$ such that closure(C') uses edges of C.

Proof: A chain C' needs a chain C if closure(C') uses edges of C, and C' is u-hard if closure(C') is $T_u$-homogeneous, but not $T_v$-homogeneous for any child v of u. For each chain C' there exists a unique node u of T such that C' is u-hard. If C' is dependent on chain C, then C' is u-hard for an ancestor u of $v(C)$. If C' is u-hard and v is a descendant of u and an ancestor of $v(C')$, then C' is dependent on $C(v)$. To prove the lemma it suffices to show the following two claims:

Claim 1: There are at most $d^{2\log d+2}$ chains $C' \notin T_{v(C)}$ such that C' needs C and C' is u-hard for some ancestor u of $v(C)$.

Claim 2: There are at most $(d+1)d^{2\log d+2}$ chains $C' \notin T_{v(C)}$ such that C' needs C and C' is u-hard for some node u that is not an ancestor of $v(C)$.

Proof of Claim 1: If C' needs C, then C' either does not yet exist or is unfinished when C is taken into progress. Consider the point in time when C is taken into progress. Let $u_1, u_2, \ldots, u_l$ be the ancestors of $v(C)$ in T that fulfill the following conditions: Each node $u_j$ has a child $v_j$ such that (a) $T_{v_j}$ contains unfinished chains, and (b) $v(C) \notin T_{v_j}$. Thus, every chain that needs C lies in one of the subtrees $T_{v_j}$. Note that $l \le d$, since by Proposition 1, every subtree that contains an unfinished chain not equal to the current chain must own a token. Assume C' belongs to $T_{v_j}$. Since $u_j$ is the least common ancestor of $v(C)$ and $v(C')$, and C' is u-hard for an ancestor u of $v(C)$, C' is dependent on $C(u_j)$. Since by Lemma 5 there are at most $d^{2\log d+1}$ chains that are dependent on $C(u_j)$, there can be at most $l \cdot d^{2\log d+1} \le d^{2\log d+2}$ chains $C' \notin T_{v(C)}$ that need C and are u-hard for an ancestor of $v(C)$. ■

Proof of Claim 2: Let i be the color of C. Let us denote the concatenation of all chains of color i as the path of color i. Note that the path of color i introduces a linear order on the chains of color i. We say a chain C lies between two other chains on the path of color i if C is not equal to one of the chains and lies between them in the linear order. We define first the nearest predecessor of a chain. Then we show (1) that for each chain $C' \notin T_{v(C)}$ that needs C and is u-hard for some node u that is not an ancestor of $v(C)$, there exists a chain $C_1$ of color i such that

• C lies on the path of color i between $C_1$ and its nearest predecessor, and

• $C_1$ fulfills the conditions of Claim 1, i.e., C' needs $C_1$ and u is an ancestor of $v(C_1)$.

We show next (2) that there exist at most d chains $C_1$ of color i for which C lies on the path of color i between $C_1$ and its nearest predecessor. By Claim 1 and Lemma 6, for each $C_1$ there exist at most $(d+1)d^{2\log d+1}$ closures that are hard for an ancestor of $v(C_1)$. It follows that there are at most $d(d+1) \cdot d^{2\log d+1}$ chains C' that need C and are u-hard for some node u that is not an ancestor of $v(C)$.

Consider the point in time when C is taken into progress. Let $a(C)$ be the closest ancestor of $v(C)$ such that $T_{a(C)}$ contains a node of color i that is not equal to $v(C)$. The nearest predecessor of C is the chain $C' \ne C$ of color i that was taken into progress most recently in $T_{a(C)}$.

(1) The closure of C' introduces an order on the chains belonging to it. Let $C_1$ be the last chain of $T_u$ before C on closure(C') and let $C_2$ be the first chain of $T_u$ after C on closure(C'), i.e. C lies on the path of color-i edges between $C_1$ and $C_2$. We show below that the path of color-i edges between $C_1$ and $C_2$ is contained in the path of color-i edges between $C_1$ and its nearest predecessor. This implies that C lies on the path of color-i edges between $C_1$ and its nearest predecessor and completes the proof of (1).

Since $T_u$ is a subtree that contains $C_1$ and $C_2$, i.e. $C_1$ and another chain of color i that was taken into progress before $C_1$, $T_u$ also must contain the nearest predecessor of $C_1$. Following the path of color-i edges from $C_1$, $C_2$ is the first chain of $T_u$ that is encountered. Thus, the color-i path between $C_1$ and $C_2$ is contained in the color-i path between $C_1$ and its nearest predecessor.

(2) We want to bound the number of color-i chains $C_1$ such that C lies on the path of color i between $C_1$ and its nearest predecessor. Obviously, $C_1$ was created after C was taken in progress (otherwise, $C_1$ would have been appended to C). Consider the point in time when C is taken into progress. Let $C^{(1)}, \ldots, C^{(l)}$ be the chains that are parents of fresh chains. All chains created afterwards must belong to $T_{v(C)}$ or to $T_{v(C^{(1)})}, \ldots, T_{v(C^{(l)})}$. Note (a) that for no color-i chain in $T_{v(C)}$, C can lie on the color-i path between the chain and its nearest predecessor. Note (b) that for $k = 1, \ldots, l$, only for the color-i chain in $T_{v(C^{(k)})}$ created first after C was taken into progress, C can lie between that chain and its nearest predecessor. The nearest predecessor of every color-i chain D created later belongs to $T_{v(C^{(k)})}$ and was created after C. Thus, C does not lie on the color-i path between D and its predecessor. Thus, at most l chains exist such that C lies on the color-i path between the chain and its predecessor. By Proposition 1, $l \le d$. ■

Theorem 2 Using the Balance algorithm, the robot explores an unknown graph with deficiency d and traverses each edge at most $(d+1)^6 d^{2\log d}$ times.

Proof: Let e be an arbitrary edge of chain C. Edge e is traversed for the first time when it is explored during an execution of line 5 of the Balance algorithm. By Lemma 4, it can be traversed $3d + 2$ times during executions of lines 17 and 18. By Lemmas 6 and 7, e belongs to at most $d^{2\log d+1} + (d+2)d^{2\log d+2}$ paths closure(C'). We show that each path closure(C') is traversed at most $d(d+1)$ times. The path closure(C') is used at most d times during an execution of line 2 of Relocate, since each time a token is removed from the finished chain C'. The path closure(C') can also be used at most $d^2$ times in line 4 of Relocate, since each time a token is removed from the finished subtree $T_{v(C'')}$ of a child C'' of C'.

Finally, the edge e might be traversed $d(d+1)$ times in line 9 of Relocate. When e is traversed in line 9, then (i) either the robot had moved to $C_0$ after the introduction of a new token (line 16) or (ii) there exists an ancestor u of $v(C)$ with a child x such that the robot was stuck at a node in $T_x$ and $T_x$ is finished. Thus, by going "up" the tree T in lines 3-5, the robot reached u. Case (i) occurs at most d times. When C becomes the current chain for the first time, let $u_1, \ldots, u_l$ be the ancestors of $v(C)$ such that each $u_j$ has a child $v_j$ with (a) $T_{v_j}$ contains unfinished chains, and (b) $v(C) \notin T_{v_j}$. By Proposition 1, the nodes $u_1, \ldots, u_l$ can have a total of d children satisfying (a) and (b). Since each subtree rooted at one of these children can contain at most d tokens, case (ii) occurs at most $d^2$ times.

Thus, edge e is traversed at most

$$1 + 3d + 2 + d(d+1)\left(d^{2\log d+1} + (d+2)d^{2\log d+2}\right) + d(d+1) \le (d+1)^5 d^{2\log d}$$

times. Multiplying the bound by d to account for restarts shows the theorem. ■
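The final inequality can be checked numerically. The sketch below reads the left-hand side as $1 + (3d+2) + d(d+1)\,(d^{2\log d+1} + (d+2)d^{2\log d+2}) + d(d+1)$ and the right-hand side as $(d+1)^5 d^{2\log d}$, and assumes base-2 logarithms; both the reading of the garbled formula and the function names are assumptions made for illustration.

```python
import math


def per_edge_total(d: int) -> float:
    """Left-hand side: exploration, lines 17-18, closure traversals, line 9."""
    L = 2 * math.log2(d)
    closures = d ** (L + 1) + (d + 2) * d ** (L + 2)
    return 1 + (3 * d + 2) + d * (d + 1) * closures + d * (d + 1)


def stated_bound(d: int) -> float:
    """Right-hand side (d + 1)^5 * d^(2 log d)."""
    return (d + 1) ** 5 * d ** (2 * math.log2(d))
```

The check succeeds because $d^{2\log d+1} + (d+2)d^{2\log d+2} = d^{2\log d+1}(d+1)^2$, so the dominant term is $d^2(d+1)^3 d^{2\log d}$, which leaves a slack of $(d+1)^3(2d+1)\,d^{2\log d}$ for the lower-order terms $(d+1)(d+3)$.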

The total number of edge traversals used by Balance is also $O(\min\{mn, dn^2 + m\})$, where n is the number of nodes in the graph. It is not hard to show that an upper bound of $O(\min\{mn, dn^2 + m\})$ is achieved by any exploration algorithm satisfying the following two properties: (1) When the robot gets stuck, it moves on a cycle-free path to some, i.e. arbitrary, node with new outgoing edges. (2) When the robot is not relocating, it always traverses new edges whenever possible.

We show that any exploration algorithm satisfying (1) and (2) gets stuck at most $\min\{m, dn\}$ times. The bound follows because, by Property (1), at most n edges are traversed during each relocation. Obviously, a robot gets stuck at most m times. For the proof of the second bound, let $\mathrm{in}_u(v)$ and $\mathrm{out}_u(v)$ be the number of unvisited incoming and unvisited outgoing edges of v, respectively. Let $\mathrm{def}(v) = \max\{0, \mathrm{in}_u(v) - \mathrm{out}_u(v)\}$. We show inductively that $\sum_{v \in G} \mathrm{def}(v) \le d$. This implies that, for every node v, whenever the robot explores the last unvisited edge out of v, there are at most d unvisited incoming edges at v. Thus the robot gets stuck at most d times at any node v. Summing over all nodes in G gives the desired bound of dn.

The inequality $\sum_{v \in G} \mathrm{def}(v) \le d$ holds initially. The invariant is maintained whenever the robot relocates from a node y, where it got stuck, to some node z with new outgoing edges because only visited edges are traversed. Whenever the robot starts a new exploration at a node z, visits a sequence of new edges and gets stuck at a node x, $\mathrm{def}(z)$ increases by at most 1, $\mathrm{def}(x)$ decreases by 1, while at no other node the def-value changes.
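The bound on the number of times the robot gets stuck can be observed on small instances with a simple simulation. The sketch below is an illustration under stated assumptions, not the Balance algorithm: the function `explore_greedy` is a hypothetical helper that follows Property (2), counts a stuck event only when a relocation is still needed, and jumps directly to the first visited node with new outgoing edges instead of walking a relocation path (so it counts stuck events, not edge traversals). The input graph is assumed to be strongly connected.

```python
from collections import defaultdict


def explore_greedy(edges, start):
    """Return how often a greedy explorer gets stuck and must relocate."""
    out = defaultdict(list)
    for u, v in edges:
        out[u].append(v)
    used = defaultdict(int)      # visited outgoing edges per node
    visited_nodes = [start]      # nodes in order of first visit
    seen = {start}
    x, stuck = start, 0
    while True:
        # Property (2): traverse new edges as long as possible.
        while used[x] < len(out[x]):
            y = out[x][used[x]]
            used[x] += 1
            x = y
            if x not in seen:
                seen.add(x)
                visited_nodes.append(x)
        # Stuck at x: look for a visited node that still has new edges.
        frontier = [v for v in visited_nodes if used[v] < len(out[v])]
        if not frontier:
            return stuck         # everything explored, no relocation needed
        stuck += 1
        x = frontier[0]          # arbitrary choice, as Property (1) allows
```

On the graph with edges 0→1, 1→0, 0→2, 2→0, 1→2 (m = 5, n = 3, deficiency 1), starting at node 0, the explorer gets stuck once at node 0 and relocates to node 1, which is within the bound $\min\{m, dn\} = 3$.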

4 A tight lower bound for the Balance algorithm and modifications

In this section we give first a lower bound for the Balance algorithm and afterwards 
we give lower bounds for modifications of Balance. 

Theorem 3 For every $d \ge 1$, there exists a graph G of deficiency d that is explored by Balance using $d^{\Omega(\log d)} m$ edge traversals.

Proof: We show that there exists a graph $G = (V, E)$ and an edge $e \in E$ that is traversed $d^{\Omega(\log d)}$ times while Balance explores G. The theorem follows by replacing e by a path of $\Theta(m)$ edges. We show the bound for d being a power of 5. The bound for all values of d follows by "rounding" down to the largest power of 5 smaller than d.

The graph is a union of chains C, each of which consists of three edges, a startnode, an endnode and two interior nodes $v^1(C)$ and $v^2(C)$. The interior nodes belong to exactly one chain and have up to one additional outgoing edge. We describe G, see also Figure 9. Graph G contains (a) a cycle $C_0$ that starts and ends in a node v (Balance is started at v and finds $C_0$ during Phase 1) and (b) a recursively defined problem $P^d$ attached to $C_0$.

In the following let $\delta$, $1 \le \delta \le d$, be a power of 5. A problem $P^\delta$, for any integer $\delta \ge 5$, is a subgraph that has two incoming edges whose startnodes do not belong to $P^\delta$ but whose endnodes do, and $\delta + 1$ outgoing edges whose startnodes belong to $P^\delta$ but whose endnodes do not. A problem $P^1$ has one incoming and one outgoing edge. In the case of $P^d$, the two incoming edges start at $v^1(C_0)$ and $v^2(C_0)$, respectively; d outgoing edges point to v and one outgoing edge points to $v^1(C_0)$.

For the definition of $P^\delta$ we also need problems $Q^\delta$. These problems are identical to $P^\delta$ except that, for $\delta > 1$, $Q^\delta$ has exactly $\delta + 1$ incoming edges.

Figure 9: The graph G

A problem $P^1$ consists of a single chain; the first edge of the chain represents an incoming edge and the last edge represents an outgoing edge. The interior nodes have no additional outgoing edges. A problem $Q^1$ is identical to $P^1$.

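For $\delta \ge 5$, let $\gamma = \delta/5$. Problem $P^\delta$ consists of $3\gamma^2$ chains $C^{\gamma}_{i,k}$, $1 \le i \le \gamma$, $1 \le k \le 3\gamma$, as well as $\gamma$ chains $D^{\gamma}_i$ and $\gamma$ recursive subproblems $Q^{\gamma}_i$, $1 \le i \le \gamma - 1$, and $P^{\gamma}$.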

These components are assembled as follows. One of the incoming edges of $P^\delta$ is the first edge of $C^{\gamma}_{1,1}$. We assume that $v^1(C_0)$ is the startnode of $C^{\gamma}_{1,1}$. Node $v^1(C^{\gamma}_{i,k})$ is the startnode of $C^{\gamma}_{i,k+1}$, $1 \le i \le \gamma$, $1 \le k \le 3\gamma - 1$. Node $v^1(C^{\gamma}_{i,3\gamma})$ is the startnode of $C^{\gamma}_{i+1,1}$, $1 \le i \le \gamma - 1$. The last edge of $C^{\gamma}_{1,k}$, $1 \le k \le 3\gamma$, is an outgoing edge of $P^\delta$. The endnode of $C^{\gamma}_{i,k}$ is equal to the startnode of $C^{\gamma}_{i-1,k}$, $2 \le i \le \gamma$ and $1 \le k \le 3\gamma$. Note that the last edge of $C^{\gamma}_{1,1}$ hence is an outgoing edge of $P^\delta$. Nodes $v^2(C^{\gamma}_{i,k})$, $1 \le i \le \gamma$, $1 \le k \le 3\gamma - 1$, have no additional outgoing edge but nodes $v^2(C^{\gamma}_{i,3\gamma})$, $1 \le i \le \gamma - 1$, do. Chain $C^{\gamma}_{\gamma,3\gamma}$ has no additional outgoing edges.

The second incoming edge of $P^\delta$ is the first edge of a chain $D^{\gamma}_1$ and, for $2 \le i \le \gamma$, the edge leaving $v^2(C^{\gamma}_{i-1,3\gamma})$ is the first edge of $D^{\gamma}_i$. For $1 \le i \le \gamma$, the last edge of $D^{\gamma}_i$ is an outgoing edge of $P^\delta$. If $\delta = 5$, then the first interior node of the chain $D^{\gamma}_{\gamma} = D^{1}_{1}$ has an additional outgoing edge pointing into a problem $P^1$. If $\delta > 5$, then the two interior nodes of $D^{\gamma}_i$, $1 \le i \le \gamma$, each have an additional outgoing edge. For $1 \le i \le \gamma - 1$, these two edges point into $Q^{\gamma}_i$ and, for $i = \gamma$, they point into $P^{\gamma}$.

If $\delta = 5$, then the outgoing edge of the only subproblem $P^1$ is an outgoing edge of $P^\delta = P^5$. If $\delta > 5$, the problems $Q^{\gamma}_i$, $1 \le i \le \gamma - 1$, and $P^{\gamma}$ each have $\gamma + 1$ outgoing edges. For $Q^{\gamma}_1$, $\gamma$ of these edges are also outgoing edges of $P^\delta$ and one edge points to the interior node of $D^{\gamma}_1$ that is the startnode of $C^{\gamma}_{1,1}$. For $2 \le i \le \gamma - 1$, exactly $\gamma - 1$ edges leaving $Q^{\gamma}_i$ point into $Q^{\gamma}_{i-1}$ such that every node that has l more outgoing than incoming edges, for $l > 0$, receives l edges. One outgoing edge points to the interior node of $D^{\gamma}_{i-1}$ that does not get an edge from $Q^{\gamma}_{i-1}$ and the remaining edge points to the interior node of $D^{\gamma}_i$ that is the startnode of $C^{\gamma}_{1,1}$. In the same way the edges leaving $P^{\gamma}$ are connected with $Q^{\gamma}_{\gamma-1}$, $D^{\gamma}_{\gamma-1}$ and $D^{\gamma}_{\gamma}$.

We identify the sources of $P^{\delta}$, i.e. the nodes having higher outdegree
than indegree. At each source, indegree and outdegree differ by 1. The startnodes
of the chains $D^{\delta}_i$, $2 \le i \le \gamma$, and $C^{\delta}_{\gamma,k}$,
$1 \le k \le 3\gamma$, represent a total of $4\gamma - 1$ sources. One interior
node of $D^{\delta}_{\gamma}$ represents a source. Finally, the subproblem
$P^{\gamma}$ contains $\gamma - 1$ sources.
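
As a quick consistency check on this count, the three groups of sources can be
added up using $\gamma = \delta/5$. The grouping labels below are only a reading
aid and are not notation from the construction itself.

```latex
\[
  \underbrace{(4\gamma - 1)}_{\text{startnodes of chains}}
  \;+\; \underbrace{1}_{\text{interior node of } D^{\delta}_{\gamma}}
  \;+\; \underbrace{(\gamma - 1)}_{\text{sources inside } P^{\gamma}}
  \;=\; 5\gamma - 1 \;=\; \delta - 1 .
\]
```

For $\delta = 5$ (so $\gamma = 1$) this gives $3 + 1 + 0 = 4$ sources, and in
general the total agrees with the $\delta - 1$ additional incoming edges that a
problem $Q^{\delta}$ receives.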

A problem $Q^{\delta}$, $\delta \ge 5$, is the same as $P^{\delta}$, except that
the subproblem $P^{\gamma}$ is replaced by a problem $Q^{\gamma}$. As mentioned
before, a problem $Q^{\delta}$ receives $\delta - 1$ additional incoming edges.
These edges point to the nodes that represent sources in $P^{\delta}$.

We analyze the number of edge traversals used by Balance on G. Consider a
problem $P^{\delta}$, $\delta \ge 5$, and let $\gamma = \delta/5$. When Balance
generates the strand of chains $C^{\delta}_{i,1}, \ldots, C^{\delta}_{i,3\gamma}$,
for some $1 \le i \le \gamma$, this strand contains $3\gamma > \gamma + 1$ tokens.
Since $D^{\delta}_i$ and the subproblem attached to it contain $\gamma$ tokens,
Balance does not explore the unvisited edges out of $C^{\delta}_{i,3\gamma}$ before
the subproblem attached to $D^{\delta}_i$ is finished. In the same way we can argue
for a problem $Q^{\delta}$.

Let $N(\delta)$ be the number of times the following event happens while Balance
works on a problem $P^{\delta}$ or $Q^{\delta}$: Balance generates a new chain,
gets stuck and cannot reach a node with new outgoing edges by using only edges in
$P^{\delta}$ resp. $Q^{\delta}$. Problem $P^{\delta}$ contains $\gamma$ subproblems
$Q^{\gamma}_1, \ldots, Q^{\gamma}_{\gamma-1}$ and $P^{\gamma}$. Every time Balance
gets stuck in one of these subproblems and has to leave it in order to resume
exploration, it also has to leave $P^{\delta}$. This is because of the following
facts: (1) When Balance explores $Q^{\gamma}_i$, $1 \le i \le \gamma - 1$, or
$P^{\gamma}$, the subproblems $Q^{\gamma}_1, \ldots, Q^{\gamma}_{i-1}$ resp.
$Q^{\gamma}_1, \ldots, Q^{\gamma}_{\gamma-1}$ are already finished. (2) The chains
$D^{\delta}_1, \ldots, D^{\delta}_{\gamma}$ ensure that Balance cannot reach any
chain $C^{\delta}_{i,k}$, $1 \le i \le \gamma$, $1 \le k \le 3\gamma$, from where
the unfinished chains in $P^{\delta}$ can be reached. Again the same holds for a
problem $Q^{\delta}$. Thus, for $\delta \ge 5$,
$N(\delta) \ge \gamma N(\gamma) = (\delta/5) N(\delta/5)$. Since $N(\delta) = 1$
for $\delta = 1$, we obtain $N(d) = d^{\Omega(\log d)}$. Finally, consider the edge
$e$ on $C_0$ that leaves $v$. Balance must traverse $e$ at least
$N(d) = d^{\Omega(\log d)}$ times. ■
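
The step from the recurrence to $d^{\Omega(\log d)}$ can be checked numerically.
The sketch below is only an illustration, under the assumption that $\delta$ is a
power of 5 and that the recurrence holds with equality. The function names
`stuck_lower_bound` and `closed_form` are hypothetical and do not come from the
paper. For $\delta = 5^k$, unrolling gives
$5^{k-1} \cdot 5^{k-2} \cdots 5^0 = 5^{k(k-1)/2} = \delta^{(\log_5 \delta - 1)/2}$.

```python
def stuck_lower_bound(delta: int) -> int:
    """Unroll N(delta) = (delta/5) * N(delta/5) with N(1) = 1.

    Assumption: delta is a power of 5, so that gamma = delta/5 stays integral.
    """
    if delta == 1:
        return 1
    if delta % 5 != 0:
        raise ValueError("delta must be a power of 5")
    gamma = delta // 5
    return gamma * stuck_lower_bound(gamma)


def closed_form(k: int) -> int:
    """Closed form of the unrolled recurrence for delta = 5**k."""
    return 5 ** (k * (k - 1) // 2)


if __name__ == "__main__":
    # Compare the unrolled recurrence with the closed form for small k.
    for k in range(0, 6):
        delta = 5 ** k
        print(k, delta, stuck_lower_bound(delta), closed_form(k))
```

The exponent $k(k-1)/2$ grows quadratically in $k = \log_5 \delta$, which is what
makes the bound superpolynomial in $d$.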

We also modified the Balance algorithm by relocating to other nodes with new
outgoing edges. Replace the choice of $C_1$ in line 7 of the algorithm by one of
the following rules.

Round Robin: Let $C_k$ be the chain among $C_1, \ldots, C_l$ that was selected
least often in any execution of line 7.

Cheapest Subtree: Let $C_k$ be the chain among $C_1, \ldots, C_l$ such that
$T_v(C_k)$ contains the fewest dependent chains with respect to the current chain.
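
The Round Robin rule can be sketched as a small selector that remembers how often
each chain has been chosen. This is a minimal illustration, not the paper's
implementation: the class name `RoundRobinSelector`, the use of strings as chain
identifiers, and the tie-breaking by position in the candidate list are all
assumptions made here.

```python
from collections import Counter
from typing import Hashable, Sequence


class RoundRobinSelector:
    """Pick the candidate chain that has been selected least often so far."""

    def __init__(self) -> None:
        # Number of times each chain was returned by select().
        self.times_selected: Counter = Counter()

    def select(self, candidates: Sequence[Hashable]) -> Hashable:
        if not candidates:
            raise ValueError("no candidate chains")
        # min() returns the first minimal element, so ties are broken by
        # position in `candidates` (an assumption, not specified by the rule).
        chosen = min(candidates, key=lambda c: self.times_selected[c])
        self.times_selected[chosen] += 1
        return chosen


if __name__ == "__main__":
    selector = RoundRobinSelector()
    for _ in range(4):
        print(selector.select(["C1", "C2", "C3"]))
```

The Cheapest Subtree rule would need the dependency structure of the chains and
is therefore not sketched here.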

Theorem 4 For Round Robin and for Cheapest Subtree and for all $d \ge 1$, there
exist graphs of deficiency $d$ that require $d^{\Omega(\log d)} m$ edge traversals.

Proof: The proof is identical to that of Generalized-Greedy in Theorem 1. ■
5 Appendix

Lemma 8 After at most $d$ restarts of the Balance algorithm, whenever a new sink
$s_i$, $1 \le i \le d$, was discovered, we can ensure that the subgraph of explored
edges is always strongly connected.

Proof: Whenever the Balance algorithm gets stuck in line 14 and there is no path
to $s$ using edges that have been traversed before, then the algorithm stops and
has to be restarted as follows. Let $s_1, \ldots, s_{k-1}$ be the already
discovered sinks. Assume the robot gets stuck at $y$ on a path $P$ that does not
belong to any other chain created since the last restart. Whenever this happens,
$y$ is a newly discovered sink $s_k$ and must have occurred on $P$ before (each
node has degree $\ge 2$). Take the cycle $C$ between the two occurrences of $s_k$
on $P$ and restart the algorithm with current node $y$ and current chain $C$ with
the following modification: All edges traversed before the restart are marked as
$(k-1)$-visited. We show below that there exists a path of $(k-1)$-visited edges
from all previously visited sinks to $y$. Whenever the algorithm started at $s_k$
encounters an already visited sink $s_i$, $i < k$, then the algorithm traverses
the $(k-1)$-visited edges on the path from $s_i$ to $s_k$ as required in lines 16
and 17 of the algorithm, i.e., the algorithm does not get stuck at $s_i$, $i < k$.

Thus, whenever the modified algorithm restarts, it has discovered a new sink
$s_k$. Hence, after at most $d$ restarts, all sinks have been discovered and there
is a path from every sink $s_i$ with $i < k$ to $s_k$.

We show inductively that there exists a path from all previously explored sinks
to $y$. Obviously the claim holds initially. Whenever $y = s_j$ for $j > 1$, then
obviously there exists a path from $s_{j-1}$ to $s_j$, since this is how the
previous algorithm got stuck. Thus, by transitivity of the reachability relation,
all previously visited sinks can reach $s_j$. ■